Roblox’s Cube Model: Creating Interactive 4D Worlds | VP of AI Explains

Update: 2025-05-13

Description

How does Roblox use AI to power a 3D platform for hundreds of millions of users? Anupam Singh, VP of AI, dives into their new Cube AI model, integrations with LLMs for complex 3D/4D world-building, and the future of vibe coding and in-experience creation on Roblox.

Watch this episode on YouTube, X/Twitter, or Spotify.

Topics Covered:

* Roblox is the "YouTube of 3D"

* The Roblox Vision: 3D Creation & Consumption at Scale

* The AI Backbone: Safety, Moderation & Infrastructure at Roblox

* Cube AI: Towards a Foundational Model for 3D

* Using Cube AI with 3P Large Language Models (LLMs)

* The Journey to 4D: Crafting Truly Interactive Worlds

* The Rise of "Vibe Coding" at Roblox Scale

* Empowering Players: In-Experience 3D Creation

* AI's Impact on the Creator Economy

* Powering Discovery: Roblox's AI Recommendation Engine

* The Evolving Landscape: The Future of UGC and AAA 3D Content

* Anupam's Advice for Aspiring Creators & Developers

Links to Roblox Releases:

* Roblox Cube AI: https://corp.roblox.com/newsroom/2025/03/introducing-roblox-cube

* Cube GitHub Repo: https://github.com/Roblox/cube

* Voice Classifier: https://github.com/Roblox/voice-safety-classifier

Get in touch:

* Join My Newsletter: https://spatialintelligence.ai

* Connect with me on X/Twitter here: https://x.com/bilawalsidhu

* Everywhere else here: https://bilawal.ai

* Business inquiries: team@metaversity.us

Interview Transcript:

Bilawal Sidhu: Ever wonder how the metaverse gets built? Not just the idea, but the worlds and experiences millions dive into daily.

I'm not talking hypotheticals. Roblox is a colossal platform. In late 2024 alone, 85 million people jumped into Roblox daily, spending an average of two and a half hours in user-generated worlds. Roblox isn't just a game, I call it the YouTube of 3D experiences. And the opportunity is massive.

Last year, creators earned nearly a billion dollars on Roblox. But here's the catch: making 3D content is still really hard. Imagine what'll happen when you slash those barriers to entry, just like we've seen in video creation. The phone in your pocket is practically a visual effects studio.

Today I got something special for y'all. We're sitting down with Anupam Singh, VP of Engineering at Roblox, the man leading the charge on AI and ML at this amazing company that is building the literal instantiation of the metaverse.

Roblox recently announced their Cube model, as well as their plans for building a 3D foundational model for the specific purpose of creating interactive 3D worlds. So stick around to hear why they went down this autoregressive approach to tokenize 3D, allowing them not just to predict words, but to predict shapes. And how they're building towards a future where one AI model can understand geometry, textures, full-body rigging, and interactivity, enabling true 4D creation.

We'll also dive into how they're using their own 3D models with the reasoning power of any large language model to build richer, more complex worlds faster than ever before, allowing you to literally speak 3D worlds into existence. And lastly, scale. This isn't just a toy project. This is Roblox building something for their hundreds of millions of users. So if you want to understand the future of 3D and 4D creation, you're not going to want to miss this conversation. Let's get into it.

Anupam Singh: Team is very excited that I'm talking to you. They almost wanted to re-brief me on our technical work, thinking that you're talking to Bilawal, you need to be briefed. I'm like, I've been there since day one when we started this effort.

Bilawal Sidhu: (Laughs)

Anupam Singh: My name is Anupam Singh. I'm VP Engineering here at Roblox with responsibility for infrastructure, the AI platform, discovery, ads engineering, and many other things. Been here at Roblox for three and a half years. Two-time entrepreneur before that. To summarize my career, it's been about reading some great paper, which is super geeky at its time, like MapReduce or Transformer, and then spending 10 years trying to make it production-worthy and getting it to billions of dollars in revenue and billions of users.

Bilawal Sidhu: Yeah, don't tell the researchers that. They think it's the zero-to-one innovation, but how do you get that thing out to market at scale?

Anupam Singh: We have those on our team. We have the person who wrote the ControlNet paper as an advisor on the Cube team, and I always joke with him that it'll take 10 years for us to even understand all the implications of ControlNet, for example. But for the researchers, it's always very obvious. The future is very clear and obvious, and then it falls to engineers like us to make it actually happen.

Bilawal Sidhu: So speaking of that, Roblox is such an interesting application and ecosystem. I've been calling Roblox the YouTube for 3D experiences because it has that, like, closed loop between creation and consumption. But unlike video, 3D has historically been super challenging and very high barrier to entry. But that's changing. Tell me about what Roblox is doing to shatter these barriers.

Anupam Singh: I think it goes back to almost our founding principle. We have this principle called Long View, and since the founding, Dave Baszucki, our founder, has always tried to make it easier and easier. Let's say Bilawal wants to create a 3D game today. Of course, the core coding is hard. The core imagination loop is yours, but then you don't know how to get traffic, you know? But if you publish it on the Roblox game, the discovery system will start seeding it with some players and see if they're getting engagement, and then the flywheel starts happening. So the proudest thing for us is when somebody creates a game and, within 30 days, they've found their audience. So distribution and infrastructure are the big things. Now, the third leg of the stool is, of course, AI, and that's what we're going to talk about.

Bilawal Sidhu: Yeah, I love that, right? Creating a 3D experience is one thing; distributing it at scale and having a huge audience of folks that can experience it from a plurality of devices, I think, is equally key. A lot of people talked about the metaverse and kind of equate it with AR/VR headsets, but I've always been a fan of the definition that the metaverse needs to be AR/VR optional. So why not include that low-end Android device as much as a kitted-out PC that somebody may have, or a headset in the future?

Anupam Singh: And the technology to enable that, right? If you have a 2 GB phone or you have a network connection that is not strong, and you still want to play one of our games, downsampling it, upsampling it, all of that is infrastructure that we want to be invisible to both our players and our creators.

Bilawal Sidhu: Cool.

Anupam Singh: Yeah, I've been on calls with some of our top creators, and they sometimes are curious on what happens after they hit publish. And I want to tell them, that's where our challenge starts because some of the creators are able to get two or three million people into their events. And imagine two or three million people are pressing play at the same time. And you have to distribute this new update to 40,000 servers worldwide across data centers, match you with your friends, and get you inside the game because your patience will last not more than three seconds after you press the play button. So much to your point earlier, Bilawal, it is much more complicated than video because video is one way, whereas if you and I are playing Roblox, I have to make sure that we are synchronized and we are having a great experience irrespective of whether I am on a PC and you are on an Android device.

Bilawal Sidhu: Absolutely. But let's be honest, the metaverse would be a rather empty place without interesting content. So what is Roblox doing to make it easier to populate these virtual worlds with amazing 3D content?

Anupam Singh: The first one is invisible infrastructure, so that people don't have to worry about where the bits go. The second one is matching you to your audience. So it starts with matchmaking, which is the ability, after you press play, to put you into the right instance. But a lot of our machine learning and AI work is related to discovery and recommendations, whether you are in the marketplace to buy the latest avatar or on the homepage trying to figure out what game you want to play next. But one of the core values for us, and that's why I'm so proud about working at Roblox, is safety. Most people, when they think about ML and AI, think about recommendations, they think about monetization. But our heavy investment is in safety.

Bilawal Sidhu: Is that moderation?

Anupam Singh: Yeah, it could be, let's take the basic stuff. You and I are chatting on the platform. Uh, every one of the words that y
